Overview

Dataset statistics

Number of variables: 9
Number of observations: 1030
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 11
Duplicate rows (%): 1.1%
Total size in memory: 72.5 KiB
Average record size in memory: 72.1 B
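
The percentage and per-record figures above follow from the raw counts. The short check below recomputes them from the values in the table; rounding to one decimal place is an assumption about how the report formats its output, not something the report states.

```python
# Values copied from the "Dataset statistics" table.
n_observations = 1030
n_duplicate_rows = 11
total_size_kib = 72.5

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```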

Variable types

Numeric: 9

Alerts

Duplicates: Dataset has 11 (1.1%) duplicate rows
High correlation: Water (component 4)(kg in a m^3 mixture) is highly correlated with Superplasticizer (component 5)(kg in a m^3 mixture)
High correlation: Superplasticizer (component 5)(kg in a m^3 mixture) is highly correlated with Water (component 4)(kg in a m^3 mixture)
High correlation: Age (day) is highly correlated with Concrete compressive strength(MPa, megapascals)
High correlation: Concrete compressive strength(MPa, megapascals) is highly correlated with Age (day)
High correlation: Cement (component 1)(kg in a m^3 mixture) is highly correlated with Blast Furnace Slag (component 2)(kg in a m^3 mixture) and 6 other fields
High correlation: Blast Furnace Slag (component 2)(kg in a m^3 mixture) is highly correlated with Cement (component 1)(kg in a m^3 mixture) and 5 other fields
High correlation: Fly Ash (component 3)(kg in a m^3 mixture) is highly correlated with Cement (component 1)(kg in a m^3 mixture) and 5 other fields
High correlation: Water (component 4)(kg in a m^3 mixture) is highly correlated with Cement (component 1)(kg in a m^3 mixture) and 5 other fields
High correlation: Superplasticizer (component 5)(kg in a m^3 mixture) is highly correlated with Cement (component 1)(kg in a m^3 mixture) and 5 other fields
High correlation: Coarse Aggregate (component 6)(kg in a m^3 mixture) is highly correlated with Cement (component 1)(kg in a m^3 mixture) and 5 other fields
High correlation: Fine Aggregate (component 7)(kg in a m^3 mixture) is highly correlated with Cement (component 1)(kg in a m^3 mixture) and 5 other fields
High correlation: Concrete compressive strength(MPa, megapascals) is highly correlated with Cement (component 1)(kg in a m^3 mixture)
Zeros: Blast Furnace Slag (component 2)(kg in a m^3 mixture) has 466 (45.2%) zeros
Zeros: Fly Ash (component 3)(kg in a m^3 mixture) has 566 (55.0%) zeros
Zeros: Superplasticizer (component 5)(kg in a m^3 mixture) has 379 (36.8%) zeros
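
The zero percentages in the alerts are the zero counts divided by the 1030 observations. The sketch below shows one way such a share could be computed for a column and then rechecks the three reported figures. The helper name `zero_share` and the toy column are hypothetical; this is not the profiling library's own code.

```python
def zero_share(values):
    """Return (count, percentage) of entries that are exactly zero."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, 100 * zeros / len(values)

# Hypothetical toy column: 3 of its 8 entries are zero.
toy = [0.0, 142.5, 0.0, 95.0, 114.0, 0.0, 189.0, 24.0]
count, pct = zero_share(toy)
print(count, f"{pct:.1f}%")

# Zero counts reported in the alerts, out of 1030 observations.
reported = {"Blast Furnace Slag": 466, "Fly Ash": 566, "Superplasticizer": 379}
shares = {name: round(100 * n / 1030, 1) for name, n in reported.items()}
print(shares)
```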

Reproduction

Analysis started: 2023-07-13 04:28:19.018554
Analysis finished: 2023-07-13 04:28:37.728760
Duration: 18.71 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
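
The reported duration is the difference between the two timestamps. A minimal check with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamp layout used in the "Reproduction" block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-07-13 04:28:19.018554", FMT)
finished = datetime.strptime("2023-07-13 04:28:37.728760", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```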

Variables

Cement (component 1)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 280
Distinct (%): 27.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 281.1656311
Minimum: 102
Maximum: 540
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 102
5-th percentile: 143.745
Q1: 192.375
Median: 272.9
Q3: 350
95-th percentile: 480
Maximum: 540
Range: 438
Interquartile range (IQR): 157.625

Descriptive statistics

Standard deviation: 104.5071416
Coefficient of variation (CV): 0.3716924478
Kurtosis: -0.5206632839
Mean: 281.1656311
Median Absolute Deviation (MAD): 79.4
Skewness: 0.5095174326
Sum: 289600.6
Variance: 10921.74265
Monotonicity: Not monotonic
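
Several of the Cement statistics are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below rechecks these relationships using the values reported for Cement; the variable names are illustrative only.

```python
import math

# Values copied from the Cement statistics.
minimum, maximum = 102, 540
q1, q3 = 192.375, 350
mean, std = 281.1656311, 104.5071416

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # relative dispersion (unitless)
variance = std ** 2               # squared standard deviation

print(value_range, iqr, round(cv, 4), round(variance, 2))
```

The same relationships hold for every variable in this report, so they are a quick sanity check when copying figures out of it.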
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
425 | 20 | 1.9%
362.6 | 20 | 1.9%
251.37 | 15 | 1.5%
446 | 14 | 1.4%
310 | 14 | 1.4%
331 | 13 | 1.3%
250 | 13 | 1.3%
475 | 13 | 1.3%
387 | 12 | 1.2%
349 | 12 | 1.2%
Other values (270) | 884 | 85.8%

Minimum 10 values
Value | Count | Frequency (%)
102 | 4 | 0.4%
108.3 | 4 | 0.4%
116 | 4 | 0.4%
122.6 | 4 | 0.4%
132 | 2 | 0.2%
133 | 5 | 0.5%
133.1 | 1 | 0.1%
134.7 | 1 | 0.1%
135 | 2 | 0.2%
135.7 | 2 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
540 | 9 | 0.9%
531.3 | 5 | 0.5%
528 | 1 | 0.1%
525 | 7 | 0.7%
522 | 2 | 0.2%
520 | 2 | 0.2%
516 | 2 | 0.2%
505 | 1 | 0.1%
500.1 | 1 | 0.1%
500 | 10 | 1.0%

Blast Furnace Slag (component 2)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 187
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.89548544
Minimum: 0
Maximum: 359.4
Zeros: 466
Zeros (%): 45.2%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 22
Q3: 142.95
95-th percentile: 236
Maximum: 359.4
Range: 359.4
Interquartile range (IQR): 142.95

Descriptive statistics

Standard deviation: 86.27910364
Coefficient of variation (CV): 1.167582879
Kurtosis: -0.5081392049
Mean: 73.89548544
Median Absolute Deviation (MAD): 22
Skewness: 0.8007373534
Sum: 76112.35
Variance: 7444.083725
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 466 | 45.2%
189 | 30 | 2.9%
106.3 | 20 | 1.9%
24 | 14 | 1.4%
20 | 12 | 1.2%
145 | 11 | 1.1%
19 | 10 | 1.0%
22 | 8 | 0.8%
26 | 8 | 0.8%
190 | 7 | 0.7%
Other values (177) | 444 | 43.1%

Minimum 10 values
Value | Count | Frequency (%)
0 | 466 | 45.2%
0.02 | 5 | 0.5%
11 | 4 | 0.4%
13.61 | 5 | 0.5%
15 | 5 | 0.5%
17.2 | 1 | 0.1%
17.5 | 1 | 0.1%
17.6 | 1 | 0.1%
19 | 10 | 1.0%
20 | 12 | 1.2%

Maximum 10 values
Value | Count | Frequency (%)
359.4 | 2 | 0.2%
342.1 | 2 | 0.2%
316.1 | 2 | 0.2%
305.3 | 4 | 0.4%
290.2 | 2 | 0.2%
288 | 4 | 0.4%
282.8 | 4 | 0.4%
272.8 | 2 | 0.2%
262.2 | 5 | 0.5%
260 | 1 | 0.1%

Fly Ash (component 3)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 163
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.18713592
Minimum: 0
Maximum: 200.1
Zeros: 566
Zeros (%): 55.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 118.27
95-th percentile: 167.0055
Maximum: 200.1
Range: 200.1
Interquartile range (IQR): 118.27

Descriptive statistics

Standard deviation: 63.99646938
Coefficient of variation (CV): 1.181026978
Kurtosis: -1.328504785
Mean: 54.18713592
Median Absolute Deviation (MAD): 0
Skewness: 0.5374451101
Sum: 55812.75
Variance: 4095.548093
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 566 | 55.0%
141 | 16 | 1.6%
118.27 | 15 | 1.5%
79 | 14 | 1.4%
94 | 13 | 1.3%
174.24 | 10 | 1.0%
98.75 | 10 | 1.0%
95.69 | 10 | 1.0%
125.18 | 10 | 1.0%
121.62 | 10 | 1.0%
Other values (153) | 356 | 34.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 566 | 55.0%
24.46 | 5 | 0.5%
24.51 | 5 | 0.5%
24.52 | 5 | 0.5%
59 | 1 | 0.1%
60 | 1 | 0.1%
71 | 1 | 0.1%
71.5 | 1 | 0.1%
75.6 | 1 | 0.1%
76 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
200.1 | 1 | 0.1%
200 | 1 | 0.1%
195 | 3 | 0.3%
194.9 | 1 | 0.1%
194 | 1 | 0.1%
193 | 1 | 0.1%
190 | 1 | 0.1%
187 | 1 | 0.1%
185.3 | 1 | 0.1%
185 | 2 | 0.2%

Water (component 4)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 205
Distinct (%): 19.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 181.5663592
Minimum: 121.75
Maximum: 247
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 121.75
5-th percentile: 146.14
Q1: 164.9
Median: 185
Q3: 192
95-th percentile: 228
Maximum: 247
Range: 125.25
Interquartile range (IQR): 27.1

Descriptive statistics

Standard deviation: 21.35556707
Coefficient of variation (CV): 0.1176185234
Kurtosis: 0.1226763387
Mean: 181.5663592
Median Absolute Deviation (MAD): 13
Skewness: 0.07432397542
Sum: 187013.35
Variance: 456.0602447
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
192 | 118 | 11.5%
228 | 54 | 5.2%
185.7 | 46 | 4.5%
203.5 | 36 | 3.5%
186 | 28 | 2.7%
162 | 20 | 1.9%
164.9 | 20 | 1.9%
185 | 15 | 1.5%
153.5 | 15 | 1.5%
200 | 14 | 1.4%
Other values (195) | 664 | 64.5%

Minimum 10 values
Value | Count | Frequency (%)
121.75 | 5 | 0.5%
126.6 | 5 | 0.5%
127 | 1 | 0.1%
127.3 | 1 | 0.1%
137.8 | 5 | 0.5%
140 | 1 | 0.1%
140.75 | 5 | 0.5%
141.8 | 5 | 0.5%
142 | 1 | 0.1%
143.3 | 5 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
247 | 1 | 0.1%
246.9 | 1 | 0.1%
237 | 1 | 0.1%
236.7 | 1 | 0.1%
228 | 54 | 5.2%
221.4 | 1 | 0.1%
221 | 2 | 0.2%
220.1 | 1 | 0.1%
220 | 2 | 0.2%
219.7 | 1 | 0.1%

Superplasticizer (component 5)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 155
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.20311165
Minimum: 0
Maximum: 32.2
Zeros: 379
Zeros (%): 36.8%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 6.35
Q3: 10.16
95-th percentile: 16.055
Maximum: 32.2
Range: 32.2
Interquartile range (IQR): 10.16

Descriptive statistics

Standard deviation: 5.973491651
Coefficient of variation (CV): 0.9629830942
Kurtosis: 1.413185653
Mean: 6.20311165
Median Absolute Deviation (MAD): 5.31
Skewness: 0.9081127315
Sum: 6389.205
Variance: 35.6826025
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 379 | 36.8%
8 | 27 | 2.6%
11.6 | 21 | 2.0%
7 | 19 | 1.8%
6 | 17 | 1.7%
9 | 15 | 1.5%
16.5 | 15 | 1.5%
10 | 15 | 1.5%
11 | 14 | 1.4%
5.75 | 10 | 1.0%
Other values (145) | 498 | 48.3%

Minimum 10 values
Value | Count | Frequency (%)
0 | 379 | 36.8%
1.72 | 4 | 0.4%
1.9 | 1 | 0.1%
2 | 1 | 0.1%
2.2 | 1 | 0.1%
2.5 | 2 | 0.2%
3 | 6 | 0.6%
3.1 | 1 | 0.1%
3.4 | 3 | 0.3%
3.57 | 5 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
32.2 | 5 | 0.5%
28.2 | 5 | 0.5%
23.4 | 5 | 0.5%
22.1 | 1 | 0.1%
22 | 6 | 0.6%
20.8 | 1 | 0.1%
20 | 1 | 0.1%
19 | 1 | 0.1%
18.8 | 1 | 0.1%
18.6 | 5 | 0.5%

Coarse Aggregate (component 6)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 284
Distinct (%): 27.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 972.9185922
Minimum: 801
Maximum: 1145
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 801
5-th percentile: 842
Q1: 932
Median: 968
Q3: 1029.4
95-th percentile: 1104
Maximum: 1145
Range: 344
Interquartile range (IQR): 97.4

Descriptive statistics

Standard deviation: 77.75381809
Coefficient of variation (CV): 0.0799181131
Kurtosis: -0.599000555
Mean: 972.9185922
Median Absolute Deviation (MAD): 46.3
Skewness: -0.04020640267
Sum: 1002106.15
Variance: 6045.656228
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
932 | 57 | 5.5%
852.1 | 45 | 4.4%
944.7 | 30 | 2.9%
968 | 29 | 2.8%
1125 | 24 | 2.3%
1047 | 19 | 1.8%
967 | 19 | 1.8%
974 | 12 | 1.2%
942 | 12 | 1.2%
938 | 12 | 1.2%
Other values (274) | 771 | 74.9%

Minimum 10 values
Value | Count | Frequency (%)
801 | 4 | 0.4%
801.1 | 1 | 0.1%
801.4 | 1 | 0.1%
811 | 2 | 0.2%
814 | 1 | 0.1%
814.1 | 1 | 0.1%
817.9 | 1 | 0.1%
818 | 1 | 0.1%
819 | 2 | 0.2%
819.2 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1145 | 1 | 0.1%
1134.3 | 5 | 0.5%
1130 | 1 | 0.1%
1125 | 24 | 2.3%
1124.4 | 2 | 0.2%
1120 | 2 | 0.2%
1119 | 2 | 0.2%
1118.8 | 2 | 0.2%
1118 | 1 | 0.1%
1113 | 2 | 0.2%

Fine Aggregate (component 7)(kg in a m^3 mixture)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 304
Distinct (%): 29.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 773.5788835
Minimum: 594
Maximum: 992.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 594
5-th percentile: 613
Q1: 730.95
Median: 779.51
Q3: 824
95-th percentile: 898.068
Maximum: 992.6
Range: 398.6
Interquartile range (IQR): 93.05

Descriptive statistics

Standard deviation: 80.1754274
Coefficient of variation (CV): 0.103642213
Kurtosis: -0.1021647727
Mean: 773.5788835
Median Absolute Deviation (MAD): 45.49
Skewness: -0.2529792974
Sum: 796786.25
Variance: 6428.099159
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
755.8 | 30 | 2.9%
594 | 30 | 2.9%
670 | 23 | 2.2%
613 | 22 | 2.1%
801 | 16 | 1.6%
746.6 | 15 | 1.5%
887.1 | 15 | 1.5%
845 | 14 | 1.4%
712 | 14 | 1.4%
750 | 12 | 1.2%
Other values (294) | 839 | 81.5%

Minimum 10 values
Value | Count | Frequency (%)
594 | 30 | 2.9%
605 | 5 | 0.5%
611.8 | 5 | 0.5%
612 | 1 | 0.1%
613 | 22 | 2.1%
613.2 | 2 | 0.2%
614 | 1 | 0.1%
623 | 2 | 0.2%
630 | 5 | 0.5%
631 | 4 | 0.4%

Maximum 10 values
Value | Count | Frequency (%)
992.6 | 5 | 0.5%
945 | 4 | 0.4%
943.1 | 4 | 0.4%
942 | 4 | 0.4%
925.7 | 5 | 0.5%
905.9 | 5 | 0.5%
903.79 | 5 | 0.5%
903.59 | 5 | 0.5%
901.8 | 5 | 0.5%
900.9 | 5 | 0.5%

Age (day)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 14
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45.66213592
Minimum: 1
Maximum: 365
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 7
Median: 28
Q3: 56
95-th percentile: 180
Maximum: 365
Range: 364
Interquartile range (IQR): 49

Descriptive statistics

Standard deviation: 63.16991158
Coefficient of variation (CV): 1.383419989
Kurtosis: 12.16898898
Mean: 45.66213592
Median Absolute Deviation (MAD): 21
Skewness: 3.269177401
Sum: 47032
Variance: 3990.437729
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)

Common values
Value | Count | Frequency (%)
28 | 425 | 41.3%
3 | 134 | 13.0%
7 | 126 | 12.2%
56 | 91 | 8.8%
14 | 62 | 6.0%
90 | 54 | 5.2%
100 | 52 | 5.0%
180 | 26 | 2.5%
91 | 22 | 2.1%
365 | 14 | 1.4%
Other values (4) | 24 | 2.3%

Minimum 10 values
Value | Count | Frequency (%)
1 | 2 | 0.2%
3 | 134 | 13.0%
7 | 126 | 12.2%
14 | 62 | 6.0%
28 | 425 | 41.3%
56 | 91 | 8.8%
90 | 54 | 5.2%
91 | 22 | 2.1%
100 | 52 | 5.0%
120 | 3 | 0.3%

Maximum 10 values
Value | Count | Frequency (%)
365 | 14 | 1.4%
360 | 6 | 0.6%
270 | 13 | 1.3%
180 | 26 | 2.5%
120 | 3 | 0.3%
100 | 52 | 5.0%
91 | 22 | 2.1%
90 | 54 | 5.2%
56 | 91 | 8.8%
28 | 425 | 41.3%

Concrete compressive strength(MPa, megapascals)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 938
Distinct (%): 91.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.81783583
Minimum: 2.331807832
Maximum: 82.5992248
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.2 KiB

Quantile statistics

Minimum: 2.331807832
5-th percentile: 10.95942786
Q1: 23.70711515
Median: 34.44277358
Q3: 46.13628654
95-th percentile: 66.8045116
Maximum: 82.5992248
Range: 80.26741697
Interquartile range (IQR): 22.42917139

Descriptive statistics

Standard deviation: 16.70567917
Coefficient of variation (CV): 0.4664067158
Kurtosis: -0.3138436917
Mean: 35.81783583
Median Absolute Deviation (MAD): 10.9281946
Skewness: 0.4169222823
Sum: 36892.3709
Variance: 279.0797167
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
33.39821744 | 5 | 0.5%
77.29715436 | 4 | 0.4%
31.35047372 | 4 | 0.4%
71.29871316 | 4 | 0.4%
35.3011712 | 4 | 0.4%
79.29663476 | 4 | 0.4%
55.89581932 | 3 | 0.3%
17.54026944 | 3 | 0.3%
18.12632404 | 3 | 0.3%
65.19685056 | 3 | 0.3%
Other values (928) | 993 | 96.4%

Minimum 10 values
Value | Count | Frequency (%)
2.331807832 | 1 | 0.1%
3.31982694 | 1 | 0.1%
4.565020596 | 1 | 0.1%
4.782205536 | 1 | 0.1%
4.827710952 | 1 | 0.1%
4.903553312 | 1 | 0.1%
6.26733684 | 1 | 0.1%
6.280436884 | 1 | 0.1%
6.46728488 | 1 | 0.1%
6.8085755 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
82.5992248 | 1 | 0.1%
81.75116932 | 1 | 0.1%
80.19984832 | 1 | 0.1%
79.98611076 | 1 | 0.1%
79.40005616 | 1 | 0.1%
79.29663476 | 4 | 0.4%
78.80021204 | 1 | 0.1%
77.29715436 | 4 | 0.4%
76.80073164 | 1 | 0.1%
76.23536132 | 1 | 0.1%

Interactions

[81 pairwise interaction plots, one per ordered pair of the 9 variables; images not reproduced.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
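
The calculation described above can be sketched in plain Python. This is a minimal illustration rather than the report's own implementation; the helper names `average_ranks`, `covariance`, and `spearman_rho` and the toy data are hypothetical, and ties are assumed to receive the average of their rank positions.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def covariance(a, b):
    """Population covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    return covariance(rx, ry) / (covariance(rx, rx) * covariance(ry, ry)) ** 0.5

# Hypothetical toy data: y = x**3 is nonlinear but strictly increasing.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, the cubic relationship in the toy data yields a ρ of 1 (up to floating-point error) even though the relationship is not linear.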

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
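
A minimal sketch of this formula in plain Python, using population covariance and standard deviations. The function name `pearson_r` and the toy data are hypothetical. The last lines illustrate the invariance to location and scale mentioned above, and that a negative scale factor flips the sign of r.

```python
from statistics import mean

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    var_x = sum((a - mx) ** 2 for a in x) / len(x)
    var_y = sum((b - my) ** 2 for b in y) / len(y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical toy data: roughly linear with a little noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(x, y)

# Shifting and rescaling either variable (positive factors) leaves r unchanged.
r_scaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# A negative scale factor reverses the direction, so r changes sign.
r_flipped = pearson_r(x, [-b for b in y])

print(round(r, 4), round(r_scaled, 4), round(r_flipped, 4))
```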

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
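
The pair-counting definition above can be written directly in Python. This sketch implements the formula exactly as stated (often called tau-a); libraries commonly report a tie-adjusted variant (tau-b), so results may differ when the data contain ties. The function name `kendall_tau_a` and the toy data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 is a tie: it counts toward the total but toward neither group.
    return (concordant - discordant) / len(pairs)

# Hypothetical toy data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
tau = kendall_tau_a(x, y)
print(tau)
```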

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.

The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Row | Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa, megapascals)
0 | 540.000 | 0.000 | 0.000 | 162.000 | 2.500 | 1040.000 | 676.000 | 28 | 79.986
1 | 540.000 | 0.000 | 0.000 | 162.000 | 2.500 | 1055.000 | 676.000 | 28 | 61.887
2 | 332.500 | 142.500 | 0.000 | 228.000 | 0.000 | 932.000 | 594.000 | 270 | 40.270
3 | 332.500 | 142.500 | 0.000 | 228.000 | 0.000 | 932.000 | 594.000 | 365 | 41.053
4 | 198.600 | 132.400 | 0.000 | 192.000 | 0.000 | 978.400 | 825.500 | 360 | 44.296
5 | 266.000 | 114.000 | 0.000 | 228.000 | 0.000 | 932.000 | 670.000 | 90 | 47.030
6 | 380.000 | 95.000 | 0.000 | 228.000 | 0.000 | 932.000 | 594.000 | 365 | 43.698
7 | 380.000 | 95.000 | 0.000 | 228.000 | 0.000 | 932.000 | 594.000 | 28 | 36.448
8 | 266.000 | 114.000 | 0.000 | 228.000 | 0.000 | 932.000 | 670.000 | 28 | 45.854
9 | 475.000 | 0.000 | 0.000 | 228.000 | 0.000 | 932.000 | 594.000 | 28 | 39.290

Last rows

Row | Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa, megapascals)
1020 | 288.400 | 121.000 | 0.000 | 177.400 | 7.000 | 907.900 | 829.500 | 28 | 42.140
1021 | 298.200 | 0.000 | 107.000 | 209.700 | 11.100 | 879.600 | 744.200 | 28 | 31.875
1022 | 264.500 | 111.000 | 86.500 | 195.500 | 5.900 | 832.600 | 790.400 | 28 | 41.542
1023 | 159.800 | 250.000 | 0.000 | 168.400 | 12.200 | 1049.300 | 688.200 | 28 | 39.456
1024 | 166.000 | 259.700 | 0.000 | 183.200 | 12.700 | 858.800 | 826.800 | 28 | 37.917
1025 | 276.400 | 116.000 | 90.300 | 179.600 | 8.900 | 870.100 | 768.300 | 28 | 44.284
1026 | 322.200 | 0.000 | 115.600 | 196.000 | 10.400 | 817.900 | 813.400 | 28 | 31.179
1027 | 148.500 | 139.400 | 108.600 | 192.700 | 6.100 | 892.400 | 780.000 | 28 | 23.697
1028 | 159.100 | 186.700 | 0.000 | 175.600 | 11.300 | 989.600 | 788.900 | 28 | 32.768
1029 | 260.900 | 100.500 | 78.300 | 200.600 | 8.600 | 864.500 | 761.500 | 28 | 32.401

Duplicate rows

Most frequently occurring

Row | Cement (component 1)(kg in a m^3 mixture) | Blast Furnace Slag (component 2)(kg in a m^3 mixture) | Fly Ash (component 3)(kg in a m^3 mixture) | Water (component 4)(kg in a m^3 mixture) | Superplasticizer (component 5)(kg in a m^3 mixture) | Coarse Aggregate (component 6)(kg in a m^3 mixture) | Fine Aggregate (component 7)(kg in a m^3 mixture) | Age (day) | Concrete compressive strength(MPa, megapascals) | # duplicates
1 | 362.600 | 189.000 | 0.000 | 164.900 | 11.600 | 944.700 | 755.800 | 3 | 35.301 | 4
3 | 362.600 | 189.000 | 0.000 | 164.900 | 11.600 | 944.700 | 755.800 | 28 | 71.299 | 4
4 | 362.600 | 189.000 | 0.000 | 164.900 | 11.600 | 944.700 | 755.800 | 56 | 77.297 | 4
5 | 362.600 | 189.000 | 0.000 | 164.900 | 11.600 | 944.700 | 755.800 | 91 | 79.297 | 4
2 | 362.600 | 189.000 | 0.000 | 164.900 | 11.600 | 944.700 | 755.800 | 7 | 55.896 | 3
6 | 425.000 | 106.300 | 0.000 | 153.500 | 16.500 | 852.100 | 887.100 | 3 | 33.398 | 3
7 | 425.000 | 106.300 | 0.000 | 153.500 | 16.500 | 852.100 | 887.100 | 7 | 49.201 | 3
8 | 425.000 | 106.300 | 0.000 | 153.500 | 16.500 | 852.100 | 887.100 | 28 | 60.295 | 3
9 | 425.000 | 106.300 | 0.000 | 153.500 | 16.500 | 852.100 | 887.100 | 56 | 64.301 | 3
10 | 425.000 | 106.300 | 0.000 | 153.500 | 16.500 | 852.100 | 887.100 | 91 | 65.197 | 3
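
A table like the one above can be produced by counting how often each full row occurs and keeping the rows seen more than once. The sketch below shows one plausible way to do this with the standard library. It is not the report's own code, and the function name `duplicate_groups` and the four-column toy rows are hypothetical.

```python
from collections import Counter

def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(r) for r in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Hypothetical toy rows: (cement, water, age, strength).
rows = [
    (362.6, 164.9, 3, 35.301),
    (425.0, 153.5, 3, 33.398),
    (362.6, 164.9, 3, 35.301),
    (540.0, 162.0, 28, 79.986),
    (362.6, 164.9, 3, 35.301),
    (425.0, 153.5, 3, 33.398),
]

groups = duplicate_groups(rows)

# List the most frequently occurring rows first.
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, n)
```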